20 April 2018

Drawing

Acknowledgements

Drawing

Encouragement from my boss

“Good luck today. Don’t embarrass us.”

Purpose

Provide a broad overview of how the United States Military Academy at West Point uses R to improve cadet performance

  • What is West Point?

  • Who are we?

  • How do we use R?

What is West Point?

Overview of West Point

  • 4-Year Academy

  • 3 Pillars
    • Academic
    • Military
    • Physical
  • Graduate
    • Bachelor's Degree
    • Second Lieutenant in US Army

Drawing

Who are the faculty?

Drawing

Who am I?

Drawing

Straight Talk From The Frontline

“Being a data scientist is when you learn more and more about more and more, until you know nothing about everything.” - Will Cukierski

Doing Data Science: Straight Talk from the Frontline
By Cathy O’Neil and Rachel Schutt

West Point R Heroes

Drawing

And many more…

Drawing

At the risk of selling myself short…

Drawing

Where does Army Math Fit in the Curriculum?

Drawing

R Fits in Army Math Here

Drawing

How We Use R

  • Course Organization

  • STEM Outreach

  • Teaching Tool

  • Army Decision Making

  • Improving Education / Cadet Experience

Course Organization:
Course Syllabus

STEM Outreach

Teaching Tool:
Mean / Median / Mode
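
A possible classroom snippet for this slide, assuming a small made-up vector of scores. Base R provides mean() and median(), but mode() reports an object's storage type rather than its most frequent value, so stat_mode() below is a hypothetical helper introduced here for illustration only.

```r
# Made-up quiz scores for illustration
scores <- c(70, 85, 85, 90, 100)

mean(scores)    # arithmetic average
median(scores)  # middle value of the sorted scores

# Hypothetical helper: most frequent value(s) in a numeric vector
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[as.vector(counts) == max(counts)])
}

stat_mode(scores)  # most frequent score
mode(scores)       # base R's mode(): the storage type, "numeric"
```

The contrast between stat_mode() and mode() is a common stumbling block for students new to R.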

Teaching Tool:
Fantasy Baseball

Teaching Tool:
Central Limit Theorem
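
One way a simulation like this might look in class, as a minimal sketch rather than the actual course material. It assumes a right-skewed exponential population with mean 1 and standard deviation 1; sample_means() is a hypothetical helper name.

```r
set.seed(206)

# Hypothetical helper: draw `reps` samples of size n from an
# exponential population (mean 1, sd 1) and return each sample's mean
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

means_n5  <- sample_means(5)
means_n50 <- sample_means(50)

# The theorem predicts sd of the sample mean close to sigma / sqrt(n)
c(simulated = sd(means_n5),  predicted = 1 / sqrt(5))
c(simulated = sd(means_n50), predicted = 1 / sqrt(50))
```

Calling hist(means_n5) and hist(means_n50) afterwards should show the distribution of sample means becoming narrower and more symmetric as n grows, even though the population itself is skewed.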

West Point Decision Making:
Admissions

Army Decision Making:
Operational Support

Improving Education
Cadet Experience

Problem Statement

Do Students Learn Statistics Better in an Academically Homogeneous Classroom?

Drawing

Collaborators

Drawing

MA206: Introduction to Probability and Statistics

Overview

  • 500 cadets a semester

  • 17 cadets per classroom

  • 11 instructors

Experiment Development

  1. Develop model to predict performance

  2. Designate Control / Treatment Group

  3. Execute the semester / experiment

  4. Evaluate results
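
The slides do not say how sections were formed, so the following is only a sketch of how step 2 could be carried out under stated assumptions: 204 simulated cadets, sections of 17, a control half sectioned at random, and a treatment half sectioned by predicted grade. All names and values are hypothetical.

```r
set.seed(206)
section_size <- 17

# Simulated cadets with a model-predicted grade (hypothetical values)
cadets <- data.frame(id = 1:204,
                     predicted = rnorm(204, mean = 0.80, sd = 0.08))

# Randomly split cadets into control and treatment halves
cadets$group <- sample(rep(c("randomized", "ability"), each = 102))
control   <- cadets[cadets$group == "randomized", ]
treatment <- cadets[cadets$group == "ability", ]

# Control: sections filled in random order
control$section <- sample(rep(1:6, each = section_size))

# Treatment: sort by predicted grade so each section holds similar cadets
treatment <- treatment[order(treatment$predicted), ]
treatment$section <- rep(1:6, each = section_size)

# Average within-section spread of predicted grades
within_sd <- function(d) mean(tapply(d$predicted, d$section, sd))
c(control = within_sd(control), treatment = within_sd(treatment))
```

The smaller within-section spread for the treatment group is what "academically homogeneous" means in this sketch.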

Model Development

Predict Cadet Performance in MA206

Drawing

Model Building Process

Drawing

Model Building Process

  • Preprocessing
    • Center
    • Scale
    • KNN Imputation for 3 Observations
  • Design
    • Repeated Cross-Validation
    • 75/25 Train/Test Split
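
The model slides that follow use dataTrain and fitControl without showing how they were built. A minimal sketch of one way to create them with caret is given here, assuming simulated data, two made-up predictors (prior_gpa, sat_math), and 10 folds repeated 5 times; none of these specifics come from the slides.

```r
library(caret)

set.seed(206)
# Simulated stand-in for the cadet data (hypothetical columns and values)
cadets <- data.frame(
  ma206     = rnorm(200, mean = 80, sd = 8),
  prior_gpa = rnorm(200, mean = 3, sd = 0.4),
  sat_math  = rnorm(200, mean = 650, sd = 60)
)
predictors <- c("prior_gpa", "sat_math")

# 75/25 split, stratified on the outcome
inTrain   <- createDataPartition(cadets$ma206, p = 0.75, list = FALSE)
dataTrain <- cadets[inTrain, ]
dataTest  <- cadets[-inTrain, ]

# Learn centering and scaling on the training rows, then apply to both sets
pp <- preProcess(dataTrain[, predictors], method = c("center", "scale"))
dataTrain[, predictors] <- predict(pp, dataTrain[, predictors])
dataTest[, predictors]  <- predict(pp, dataTest[, predictors])

# Repeated cross-validation; 10 folds x 5 repeats is an assumed setting
fitControl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
```

Adding "knnImpute" to the preProcess() method vector would fill in missing predictor values from nearest neighbors; it is left out here because the simulated data has no missing values.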

Linear Regression

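library(caret)  # provides train(); fitControl holds the resampling settings
set.seed(206)
# Linear regression with stepwise selection by AIC
lm1 <- train(ma206 ~ ., data = dataTrain,
             method = "lmStepAIC",
             trControl = fitControl)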

LASSO

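set.seed(206)
# Note: caret's "glmnet" tunes alpha as well as lambda by default;
# restrict alpha to 1 via tuneGrid for a pure LASSO fit
lasso <- train(ma206 ~ ., data = dataTrain,
               method = "glmnet",
               trControl = fitControl)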

Random Forest

set.seed(206)
# Random forest; caret tunes mtry over its default grid
randforest <- train(ma206 ~ ., data = dataTrain,
                    method = "rf",
                    trControl = fitControl)

Drawing

Ensemble

Drawing

Results

Control vs Treatment

# Two-sample t-test on the difference in mean grades between the two groups
t.test(randomized, ability, mu = 0, alternative = "two.sided")

Drawing

95% CI of difference in means: (-0.010, 0.019)
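
To show where an interval like this comes from, here is a small sketch on simulated grades. The two vectors are hypothetical stand-ins that reuse the slide's variable names and are not the study's data.

```r
set.seed(206)

# Simulated course grades as proportions (hypothetical, not the study data)
randomized <- rnorm(250, mean = 0.80, sd = 0.08)
ability    <- rnorm(250, mean = 0.80, sd = 0.08)

result <- t.test(randomized, ability, mu = 0, alternative = "two.sided")

result$conf.int  # 95% CI for mean(randomized) - mean(ability)
result$p.value

# TRUE when the interval covers 0, i.e. no detectable difference in means
contains_zero <- result$conf.int[1] < 0 && result$conf.int[2] > 0
contains_zero
```

An interval that covers zero, as the one reported on the slide does, is consistent with no difference in mean grades between the two groups.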

Results

Pairwise Comparisons

Drawing

How does Ability Group Relate to Future Grades?

mod <- lm(FinalGrade ~ predictedgrade + abilityindicator, data = data)
summary(mod)
Drawing
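
A runnable sketch of the same regression on simulated data, for readers who want to see the shape of the output. The column names follow the slide; the values, the effect sizes, and the treatment of abilityindicator as a two-level factor are assumptions made here.

```r
set.seed(206)
n <- 300

# Simulated data: column names from the slide, values made up
data <- data.frame(
  predictedgrade   = rnorm(n, mean = 0.80, sd = 0.08),
  abilityindicator = factor(sample(c("randomized", "ability"), n, replace = TRUE),
                            levels = c("randomized", "ability"))
)
# Final grade depends on the predicted grade only (no group effect simulated)
data$FinalGrade <- 0.1 + 0.85 * data$predictedgrade + rnorm(n, sd = 0.05)

mod <- lm(FinalGrade ~ predictedgrade + abilityindicator, data = data)

# Estimate, standard error, t value, and p-value for each term
coef(summary(mod))
```

The abilityindicatorability row estimates the change in final grade for the ability-grouped cadets after adjusting for the predicted grade.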

Enough P-Hacking

Drawing